A Naïve Bayes Approach for Word Sense Disambiguation
Abstract
Word sense disambiguation (WSD) is the task of automatically selecting the correct sense of a word given its context; it helps resolve many of the ambiguity problems inherent in all natural languages. Statistical natural language processing (NLP), which is based on probabilistic, stochastic and statistical methods, has been used to solve many NLP problems. The Naïve Bayes algorithm, a supervised learning technique, has worked well in many classification problems. In the present work, the Naïve Bayes machine learning technique is applied to the WSD task of disambiguating the senses of different words from the standard corpora available in the “1998 SENSEVAL Word Sense Disambiguation (WSD) shared task”. It is observed that the senses of ambiguous words with fewer parts of speech are disambiguated more accurately. Another key observation is that the fewer senses there are to choose among, the more likely a word is to be assigned its correct sense.
Keywords: Word sense disambiguation, WSD, POS-filtering, ambiguity, Naïve Bayes, supervised learning
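As a rough illustration of the general technique named in the abstract, the sketch below trains a bag-of-words Naïve Bayes classifier that picks the sense maximizing log P(sense) plus the sum of log P(word | sense) over the context words. It is not the paper's implementation: the add-one smoothing, the function names `train` and `disambiguate`, and the toy examples for the word "bank" are all assumptions made for this example, and no SENSEVAL data is used.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count senses and context words from (context_words, sense) pairs."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for words, sense in examples:
        sense_counts[sense] += 1
        for w in words:
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def disambiguate(model, context_words):
    """Return the sense with the highest smoothed log-posterior score."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, n in sense_counts.items():
        score = math.log(n / total)  # log prior of the sense
        # Add-one (Laplace) smoothing: an assumption of this sketch.
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in context_words:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[sense][w] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Hypothetical toy training data for the ambiguous word "bank".
TOY_EXAMPLES = [
    (["deposit", "money", "account"], "finance"),
    (["loan", "interest", "money"], "finance"),
    (["river", "water", "shore"], "river"),
    (["fishing", "river", "mud"], "river"),
]

model = train(TOY_EXAMPLES)
financial_sense = disambiguate(model, ["money", "loan"])
river_sense = disambiguate(model, ["water", "fishing"])
```

Scores are summed in log space to avoid floating-point underflow when the context is long. A part-of-speech filter of the kind the keywords mention could be added by restricting the candidate senses before scoring.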
Related resources
Naïve Bayes Classifier for Arabic Word Sense Disambiguation
Word Sense Disambiguation (WSD) is the process of selecting a sense of an ambiguous word in a given context from a set of predefined senses. The sense inventory usually comes from a dictionary or thesaurus. In Arabic, the main cause of word ambiguity is the lack of diacritics in most digital documents, so the same word can occur with different senses. In this paper, we use the rooting algorithm...
Exemplar-Based Word Sense Disambiguation: Some Recent Improvements
In this paper, we report recent improvements to the exemplar-based learning approach for word sense disambiguation that have achieved higher disambiguation accuracy. By using a larger value of k, the number of nearest neighbors to use for determining the class of a test example, and through 10-fold cross validation to automatically determine the best k, we have obtained improved disambiguation ...
Improving Word Sense Disambiguation Using Topic Features
This paper presents a novel approach for exploiting the global context for the task of word sense disambiguation (WSD). This is done by using topic features constructed using the latent Dirichlet allocation (LDA) algorithm on unlabeled data. The features are incorporated into a modified naïve Bayes network alongside other features such as part-of-speech of neighboring words, single words in the...
Raw Corpus Word Sense Disambiguation
A wide range of approaches have been applied to word sense disambiguation. However, most require manually crafted knowledge such as annotated text, machine-readable dictionaries or thesauri, semantic networks, or aligned bilingual corpora. The reliance on these knowledge sources limits portability since they generally exist only for selected domains and languages. This poster presents a corpus-b...
A Simple Approach to Building Ensembles of Naive Bayesian Classifiers for Word Sense Disambiguation
This paper presents a corpus-based approach to word sense disambiguation that builds an ensemble of Naive Bayesian classifiers, each of which is based on lexical features that represent co-occurring words in varying sized windows of context. Despite the simplicity of this approach, empirical results disambiguating the widely studied nouns line and interest show that such an ensemble achieves acc...